GPT models were first launched by OpenAI in 2018. All GPT models use the transformer architecture for natural language processing tasks such as:
- Language generation
- Translation
- Question answering
![[Pasted image 20230219144933.png]]
## Generative
Generative refers to the model's ability to **generate new text** based on patterns it has learned from its training data.
A generative language model like GPT is capable of producing coherent text in response to a prompt rather than selecting a predefined response.
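As a toy illustration of "generating from learned patterns" (this miniature bigram sampler is a made-up stand-in for GPT's learned distribution, not the transformer itself), new text can be produced by repeatedly sampling a plausible next word:

```python
import random

# Hypothetical tiny "training data"; GPT learns from vastly more text.
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Learn which word tends to follow which (the model's "patterns").
bigrams = {}
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams.setdefault(prev, []).append(nxt)

def generate(prompt_word, length=5, seed=0):
    """Generate new text by repeatedly sampling a likely next word."""
    random.seed(seed)
    out = [prompt_word]
    for _ in range(length):
        candidates = bigrams.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the"))
```

The key point is that the continuation is sampled, not looked up: the same prompt can yield different coherent outputs, rather than one predefined response.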
## Pre-Trained
Pre-trained means that the model has **already been trained** on a large amount of text data before it is fine-tuned for specific tasks. This allows it to learn faster and achieve better results than training from scratch.
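A minimal sketch of why this helps, using a toy 1-D regression instead of a language model (the tasks, learning rate, and step counts are all illustrative assumptions): starting fine-tuning from parameters that already fit a related task beats a random start under the same small training budget.

```python
# "Pre-train" on one task, then "fine-tune" on a related task, and compare
# against training on the new task from scratch with the same budget.

def train(w, data, lr=0.01, steps=0):
    """A few steps of gradient descent on mean squared error for y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

# "Pre-training" task: y = 2.0x. "Fine-tuning" task: the related y = 2.1x.
pretrain_data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0)]
finetune_data = [(x, 2.1 * x) for x in (1.0, 2.0, 3.0)]

w_pretrained = train(0.0, pretrain_data, steps=200)       # ends near 2.0
w_from_pretrained = train(w_pretrained, finetune_data, steps=5)
w_from_scratch = train(0.0, finetune_data, steps=5)

# With the same 5-step budget, the pre-trained start ends up far closer.
print(loss(w_from_pretrained, finetune_data), loss(w_from_scratch, finetune_data))
```

The pre-trained start only has to close the small gap between the two related tasks, which is the intuition behind fine-tuning GPT for specific tasks.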
## [[Transformers]]
The transformer is the architecture used in [[Generative Pretrained Transformers - GPT]] models. It is a type of neural network that has become the gold-standard architecture for natural language processing. Unlike [[Recurrent Neural Networks - RNNs]], which process text one token at a time, a transformer uses self-attention to relate all positions in a sequence at once, so it can effectively process long sequences of text without losing information from earlier tokens.
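The self-attention step at the heart of the transformer can be sketched in a few lines of NumPy. This is a minimal single-head illustration with random weights standing in for learned projections; the shapes are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 4, 8          # 4 tokens, 8-dimensional embeddings
x = rng.normal(size=(seq_len, d_model))

# Learned projections (random here) map each token to query, key, value.
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

# Every token attends to every other token in one step, which is how the
# transformer relates distant positions without an RNN's sequential pass.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
output = weights @ V                              # shape: (seq_len, d_model)
```

Because the attention weights connect position 1 to position 4 as directly as to position 2, distance in the sequence does not degrade the information flow the way it does in an RNN's step-by-step hidden state.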